To learn the value of k, this paper makes preliminary use of a genetic algorithm to select a better k. It also puts forward a clustering validity function, whose effectiveness is confirmed by numerical experiments and which is intended to guide its use in k-nearest-neighbor classification. The paper then introduces the concept of the "generalization capability of a case" into the k-nearest-neighbor algorithm. According to the proposed approach, training cases differ in their coverage: cases with better generalization capability are retained as representative cases, while the redundant cases found within their coverage are removed. The result is a new training set that is smaller yet still represents the classes almost completely, which in turn reduces the complexity of searching for nearest neighbors.
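The case-reduction idea can be illustrated with a minimal sketch. The excerpt does not give the paper's formal definition of coverage, so the code below assumes one: the coverage of a case is the set of same-class cases lying closer to it than its nearest case of a different class. The greedy selection order and the names `coverage_sets`, `reduce_cases`, and `knn_predict` are likewise hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def coverage_sets(points, labels):
    """Assumed definition: case i covers every same-class case that is
    closer to i than i's nearest different-class case."""
    n = len(points)
    cover = []
    for i in range(n):
        # Distance from case i to its nearest case of another class.
        enemy = min(
            (dist(points[i], points[j]) for j in range(n) if labels[j] != labels[i]),
            default=float("inf"),
        )
        cover.append({
            j for j in range(n)
            if labels[j] == labels[i] and dist(points[i], points[j]) < enemy
        })
    return cover


def reduce_cases(points, labels):
    """Greedily keep the case covering the most remaining cases and
    drop the cases it covers as redundant. Returns kept indices."""
    cover = coverage_sets(points, labels)
    remaining = set(range(len(points)))
    kept = []
    while remaining:
        # Ties are broken in favor of the smallest index.
        best = max(sorted(remaining), key=lambda i: len(cover[i] & remaining))
        kept.append(best)
        remaining -= cover[best] | {best}
    return sorted(kept)


def knn_predict(query, points, labels, k=1):
    """Plain k-nearest-neighbor majority vote (vote ties resolved arbitrarily)."""
    nearest = sorted(range(len(points)), key=lambda i: dist(query, points[i]))[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)


# Toy data: two well-separated classes of three cases each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]

kept = reduce_cases(points, labels)
reduced_points = [points[i] for i in kept]
reduced_labels = [labels[i] for i in kept]
print(kept)  # [0, 3]
print(knn_predict((1, 1), reduced_points, reduced_labels))  # 0
```

On this toy data one representative per class survives, so a neighbor query compares against two cases instead of six. On overlapping classes far fewer cases would be redundant under this assumed definition.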